Poster: stbalbach Date: Jun 21, 2010 1:52pm
Forum: texts Subject: Re: user tpb uploads

> The PDF scan is kept back on the Google owned server.

A copy is also on Internet Archive; click on the HTTP link and you can see it hosted here.

> the tpb user ID represents a number of people, working for, and paid big bucks by Google to bulk scan books

Is this true? My understanding is tpb is an employee of IA (or friend of an employee), working "after hours", simply copying books from Google over to Internet Archive, where they can be archived forever (unlike at Google where books seem to disappear after a time).

> slaps on their own copy right, in effect, taking these books out of the public domain for ever and ever.

This is not accurate. Anything published prior to 1923 is in the public domain - intellectual property is determined by the law, not by a copyright symbol.

Poster: Time Traveller Date: Jun 21, 2010 6:56pm
Forum: texts Subject: Re: user tpb uploads

So how does tpb get access to the Google server to put the PDF there, before linking it to IA?

I am not entirely correct about how Google is tying up books in the public domain. Still, lots of authors are upset about Google swiping their books, and there are several class actions under way from authors, etc. At the moment I am unwell, so I cannot explain myself more clearly.

It's not a new copyright. When Google has books in the public domain, which should be free for all to download and do with as they please, Google puts restrictions on them with its Terms of Use. These may not be enforceable in a court of law, but considering how rich Google is, they have the best lawyers, a whole legal department to enforce their Terms of Use.

How many lawyers can you afford?

It's a breach of Google's Terms of Use to grab PDFs from its server and upload them to other servers, like IA.

Quite often, clicking on an IA link for a book, you get directed to Google Books and find the PDF is pay-for, and you only get to see one chapter or so.

So if tpb moves links over to the IA, where you get text-only and online reader formats but not the pay-for PDF, tpb must have an arrangement with Google. Most likely tpb works for Google and is also a member of the IA, in order to be able to upload.

If Google has a free-to-download PDF, and tpb does not belong to Google, then tpb, by downloading those PDFs, is breaching Google's Terms of Use.

Furthermore, tpb is so active in downloading and then uploading so many books that he must have an official connection with Google.

But then, if he is only swiping and then uploading to the IA, tpb is not doing the scanning and thus is not responsible for quality.

Further to my argument, if tpb is not breaching Google's Terms of Use, why are there not heaps of other IA volunteers doing the same swiping from Google, then uploading to the IA?

If it was legal with Google, why is not all their inventory on the IA by now?

That's my reasoning about who/what tpb is.

This post was modified by Time Traveller on 2010-06-22 01:56:21

Poster: stbalbach Date: Jun 21, 2010 9:03pm
Forum: texts Subject: Re: user tpb uploads

>how does tpb get access to the Google server to put the PDF there, before linking it to IA?

It's done automatically with scripts; anyone can access Google Books. First the script creates a blank shell on IA, then uploads the book, processes it, etc. You can kinda follow the process by looking at the files of an example tpb book and the timestamps and contents of the XML files in the HTTP directory.

> It's a breach of Google's Terms of Use to grab PDFs from its server and upload them to other servers, like IA.

I remember reading IA has an agreement or understanding with Google to do it and they are OK with it.

>Quite often, clicking on an IA link for a book, you get directed to Google Books and find the PDF is pay-for, and you only get to see one chapter or so.

The HTTP link would have a local copy on IA. There is also usually a copy at Hathi Trust.

>why are there not heaps of other IA volunteers doing the same swiping from Google, then uploading to the IA?

The books are in the public domain. I don't see why anyone couldn't do it, other than time and effort. tpb is doing a good service; even if the quality of the books is poor, it's better than nothing.

>why is not all their inventory on the IA by now?

It's a few million books, so there is a lot of disk space and bandwidth to consider. Plus I don't know how automated the process is; it may just take a long time.

>uploaded by tpb on behalf of Google

I don't think tpb works on behalf of Google. My recollection is tpb is connected to IA, not Google. There was a post about it on this forum a few years ago.

Poster: Time Traveller Date: Jun 22, 2010 12:20am
Forum: texts Subject: Re: user tpb uploads

That I did not know.

And does that not mean people can put malware onto Google Books?

I am close to collapse and have to go. Thanks for the information, and for not giving me a hard time. I have learnt some new things about Google.

Poster: Time Traveller Date: Jun 21, 2010 6:51pm
Forum: texts Subject: Re: user tpb uploads

About the only possible reason for books disappearing from Google Books, once they have gone to the trouble of scanning and uploading them to their servers, is Google being informed it's in breach of copyright.

So if those books get moved over to the IA beforehand, it means the IA will also be in breach of copyright.

Poster: Time Traveller Date: Jun 21, 2010 7:47pm
Forum: texts Subject: Re: user tpb uploads

There are 900,000 hits for "tpb". How many volunteers have uploaded just 100,000? I reduced from 900,000 to allow for doubles and false hits.

"Book digitized by Google and uploaded by tpb."

Not taken from Google and uploaded, but more likely digitised by Google and uploaded by tpb on behalf of Google.

This post was modified by Time Traveller on 2010-06-22 02:47:09